Bayesian Market Socialism With Costly Iteration


by 


Michael  Spagat* 
University  of  Illinois 
Champaign,  Illinois   61820 


Abstract 

We  present  an  iterative  planning  procedure  with  Bayesian  learning 
and  costly  iteration.   We  derive  the  optimal  search  procedure  for 
finding  a  production  plan.   At  the  end  some  examples  are  worked  out 
and  the  qualitative  properties  of  the  procedure  are  studied. 


I  would  like  to  thank  Rolf  Mantel  for  many  lengthy  discussions  and 
many  suggestions  that  made  this  paper  possible.   I  also  thank  Andrew 
Caplin,  Brad  Delong,  Rick  Erickson,  Andreu  Mas-Colell,  Klaus  Nehring, 
and  Doug  Tygar  for  stimulating  discussions. 


1.   Introduction 

The theory of market socialism has a long and venerable tradition in the comparative economics literature, stretching back to Lange (1938), Lerner (1944), Hayek (1935) and others in the 1920s and 1930s. This literature in turn has direct antecedents in the writings of Walras and Pareto.

In  the  contemporary  theory  of  economic  planning  there  have  been 
numerous  studies  of  iterative  multi-level  planning  processes  that 
build  on  or  depart  from  the  Lange-Lerner-type  proposals.   Examples 
include  Arrow  and  Hurwicz  (1960),  Kornai  and  Liptak  (1965),  Malinvaud 
(1967),  Heal  (1969),  Weitzman  (1970),  Cremer  (1977  &  1983)  and  Henry 
and Zylberberg (1978). In these algorithms the central planner makes some type of announcement (for example, prices in the Lange procedure and preliminary output targets in Weitzman's) and then firms respond (with profit-maximizing input-output combinations in Lange's procedure and marginal rates of substitution in Weitzman's). The central planner then adjusts his plan and the procedure is repeated.

The  basic  type  of  result  achieved  in  the  literature  is  that  a 
proposed  algorithm  will  converge  to  an  optimal  production  plan  with 
sufficiently  many  iterations.   Some  researchers  are  able  to  prove 
additional  results  such  as  that  their  algorithm  leads  to  progressively 
better  production  plans  or  that  it  will  converge  after  a  finite  number 
of  iterations. 

There are a number of objections that can be raised to this approach as it stands, as either a positive or a normative theory of central planning.² One of the most important problems is that these procedures require that firms truthfully report the appropriate information to planners when it may be in their interest to manipulate the system. But an equally important problem, we believe, is that these algorithms focus attention on long-run results (convergence after a large number of iterations) when, in the practice of planning, there is generally only time for a few iterations.³ Below we present a model that addresses the latter problem while offering no solution to the former.

In our model there is an iterative procedure that allows the central planner to learn about the production possibilities in the economy, but iteration is costly, so the planner would typically not iterate until he knew a fully optimal plan. An additional feature of the model is that the learning process is Bayesian: the planner has a prior probability distribution over the possible technologies for the economy, which is updated in a Bayesian fashion.

The information exchanged between the central planner and subordinates is quite crude in this model. The planner proposes a plan and learns only whether or not it is feasible. While we believe that the existing models in the literature presume the exchange of information to be too sophisticated, the learning process here is probably too simple, and it would be interesting to study more complicated exchanges of information in the future.

But the strongest simplifying assumption that we are forced to make here is to consider only one-dimensional technologies. The best way to think of the technology is to consider a firm with all its inputs fixed producing a single output. The problem of the planner is to try to determine the maximum feasible output for the firm and then set a plan.

To  put  this  model  on  the  same  plane  as  the  multi-level  planning 
literature  it  will  be  necessary  to  generalize  to  n  dimensions  so  this 
paper  is  only  a  beginning.   But  it  does  point  in  some  directions  which 
we  believe  are  an  improvement  on  the  existing  literature.   In 
particular,  it  incorporates  Bayesian  learning,  costly  iteration  and 
more  limited  communications  possibilities.   It  is  hoped  that  it  will 
stimulate  fruitful  research  in  the  future. 

2.   Statement  of  the  Problem 

Consider the closed interval $[a,b]$. Interpret $[a,b]$ as a set of possible production plans. Let $g \subset [a,b]$ be a closed interval containing $a$, which is interpreted as the feasible set. So there exists $\bar{x} \in [a,b]$ such that $y \in g$ iff $a \le y \le \bar{x}$. $P\{(x,y)\}$ is the subjective probability that $\bar{x} \in (x,y)$. Let $R$ denote the real numbers and define $F: R \to [0,1]$ by $F(x) = P\{(x,b)\}$ for each $x \in R$, and let $F$ have a continuous and strictly negative first derivative on $[a,b]$. Note that $F(\cdot)$ is not the distribution function for $P\{\cdot\}$ but one minus this distribution function.

Let $U: R \times R \to R$ be a von Neumann-Morgenstern utility function of the form $U(r_1, r_2) = u(r_1) - r_2$, where $u'$ exists, is continuous, and is greater than zero everywhere. Its first argument is interpreted as a production plan, and its second argument is interpreted as the cost incurred in finding out if the production plan is feasible. The central planner can choose any point $x \in R$ and learn either that $x \in g$ or that $x \notin g$. If $x \in g$, we know $\bar{x} \ge x$. But there is a cost $c$ of receiving this information.

Let $\tau$ be the set of all finite binary trees (trees where each nonterminal node has exactly two successors). Let $n: \tau \to N$ be a function where $n(t)$ is the number of terminal nodes of $t$ for each $t \in \tau$, and $N$ denotes the positive integers. A search procedure $s$ on $[a,b]$ is a tree $t \in \tau$ and a vector $y$ with $n(t)+1$ entries given by $a = y_1 < y_2 < \cdots < y_{n(t)+1} = b$. The terminal nodes of $t$ are labelled from left to right by $y_1, \ldots, y_{n(t)}$. Let $S_{[a,b]}$ be the set of all search procedures $(t,y)$ on $[a,b]$.

See Figure One for the interpretation of an $s \in S_{[a,b]}$ as a
search  procedure.   The  initial  node  corresponds  to  the  question,  "is 
y   feasible?"   If  not,  then  x  <  y  .   Then  the  next  question  is,  "is 
y,  feasible?"   If  it  is,  then  y ,  _<  x  <  y,.  and  the  search  terminates, 
costing  3c. 

We do not need to assume the search procedure is finite; in principle it could be infinite. But allowing infinite search would cause some notational difficulties, and since we show the optimality of finite search in Theorem 1, there is little to gain from accommodating the infinite case notationally.

Define $q: \tau \to \bigcup_{i=2}^{\infty} N^i$, where $N^i$ is the $i$-fold product $N \times \cdots \times N$, so that $q(t)$ is a vector of length $n(t)$, where $q_i(t)$ is the path length to the $i$th terminal node (reading from left to right) in tree $t$ for $i = 1, \ldots, n(t)$. Define $C: S_{[a,b]} \times R \to R$ by

$C((t,y),c) = c \sum_{i=1}^{n(t)} q_i(t)\,(F(y_i) - F(y_{i+1}))$.

FIGURE ONE

This is the expected cost of operating the search procedure when the cost of asking questions is $c$. For the search procedure in Figure One the expected operating cost is

$c\,[4P(a \le \bar{x} < y_2) + 4P(y_2 \le \bar{x} < y_3) + 3P(y_3 \le \bar{x} < y_4) + \cdots + 4P(y_n \le \bar{x} < b)]$.
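The expected-cost formula can be checked numerically. The following is a minimal sketch, assuming the uniform case $F(x) = 1 - x$ on $[0,1]$ and a hypothetical three-leaf tree (it is not the tree of Figure One); the function name `expected_search_cost` is ours.

```python
def expected_search_cost(path_lengths, y, F, c):
    """C((t,y),c) = c * sum_i q_i(t) * (F(y_i) - F(y_{i+1})).

    path_lengths: q_1..q_n, the number of questions asked before each terminal node
    y: partition points y_1 < ... < y_{n+1}
    F: F(x) = P(true maximum > x), i.e. one minus the distribution function
    """
    if len(y) != len(path_lengths) + 1:
        raise ValueError("y must have one more entry than path_lengths")
    weights = [F(y[i]) - F(y[i + 1]) for i in range(len(path_lengths))]
    return c * sum(q * w for q, w in zip(path_lengths, weights))

# Hypothetical tree: first ask "is 0.5 feasible?"; if not, ask "is 0.25 feasible?"
F_uniform = lambda x: 1.0 - x
cost = expected_search_cost([2, 2, 1], [0.0, 0.25, 0.5, 1.0], F_uniform, c=0.1)
```

In this illustration the three intervals carry probabilities 0.25, 0.25 and 0.5, so the expected number of questions is 1.5 and the expected cost is 1.5c.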

The planner wishes to choose a production plan, but is constrained to pick one that is feasible with probability one. By operating procedure $(t,y)$, he will choose $y_i$ with probability $F(y_i) - F(y_{i+1})$ for $i = 1, \ldots, n(t)$, but he will have to pay for the operation of procedure $(t,y)$. For fixed $c$, the problem is:

(1)  $\max_{s \in S_{[a,b]}} \sum_{i=1}^{n(t)} [u(y_i)\,(F(y_i) - F(y_{i+1}))] - C(s,c)$
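As an illustration of the objective in (1), the sketch below evaluates one given procedure; it does not search over procedures. The linear utility $u(x) = x$, the uniform $F(x) = 1 - x$, the tree and the name `procedure_value` are all assumptions made for the example.

```python
def procedure_value(path_lengths, y, u, F, c):
    """Objective of problem (1) for one procedure: expected utility minus expected cost."""
    n = len(path_lengths)
    if len(y) != n + 1:
        raise ValueError("y must have one more entry than path_lengths")
    weights = [F(y[i]) - F(y[i + 1]) for i in range(n)]
    # the planner ends up choosing y_i with probability w_i
    expected_utility = sum(u(y[i]) * weights[i] for i in range(n))
    expected_cost = c * sum(q * w for q, w in zip(path_lengths, weights))
    return expected_utility - expected_cost

value = procedure_value([2, 2, 1], [0.0, 0.25, 0.5, 1.0],
                        u=lambda x: x, F=lambda x: 1.0 - x, c=0.1)
```

Here expected utility is 0.3125 and expected cost is 0.15, so the procedure is worth 0.1625 to the planner.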

Remark 1: $n(t)$ is not an independent choice variable because it is determined by the $(t,y) \in S_{[a,b]}$ which is chosen.

Remark 2: We can assume $[a,b] = [0,1]$ by setting $\tilde{F}(x) = F(a + x(b-a))$ and $\tilde{u}(x) = u(a + x(b-a))$ for each $x \in R$. Furthermore, we can normalize so that $u(0) = 0$ and $u(1) = 1$. We can now call $S_{[a,b]}$ simply $S$.

3.   The  Hu-Tucker  Algorithm 

T. C. Hu and A. C. Tucker [4] have given a way of constructing a tree that solves a simpler problem than (1). In their problem, we start with an ordered set of terminal nodes $y_1, \ldots, y_n$ where each node $y_i$ has a weight $w_i$ which can be interpreted as its probability. Let $T(y_1, \ldots, y_n)$ be the set of all binary trees with $n$ terminal nodes labelled $y_1, \ldots, y_n$ in left to right order. In Figure Two, the tree (a) is in $T(y_1, y_2, y_3, y_4)$ and the tree (b) is not. They give an algorithm to construct the tree that solves:

(2)  $\min_{t \in T(y_1, \ldots, y_n)} \sum_{i=1}^{n} q_i(t)\,w_i$

FIGURE TWO

The algorithm is presented in the appendix for completeness. While there is some economic intuition behind it, the procedure is best viewed as a technical trick. I emphasize, however, that this algorithm can easily be executed on a computer, so everything that follows should be considered quite computable.

For each $n \in N$, let

$\Delta_n = \{(w_1, \ldots, w_n): w_i > 0,\ i = 1, \ldots, n,\ \sum_{i=1}^{n} w_i = 1\}$

i.e., the interior of the $n$-dimensional unit simplex. Let $\Delta = \bigcup_{n \in N} \Delta_n$. Let $t^*: \Delta \to \tau$ give the tree constructed by the Hu-Tucker algorithm for each vector of weights on terminal nodes in $\Delta$. Gilbert and Moore [2] have shown that for each $c \in R$ and $(w_1, \ldots, w_n) \in \Delta$, the expected cost of operating the Hu-Tucker tree is between

$c \sum_{i=1}^{n} w_i \ln(\frac{1}{w_i})$ and $c \sum_{i=1}^{n} w_i \ln(\frac{1}{w_i}) + 2c$.

The lower bound is the well-known entropy formula. Gilbert and Moore's original contribution was the upper bound.
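The bounds can be illustrated on a small example. The sketch below does not run the Hu-Tucker algorithm; it substitutes a brute-force dynamic program that solves the same problem (2), and it measures entropy in base 2, the form in which the Gilbert and Moore bound is usually stated for yes/no questions (an assumption on our part, since the text writes ln). The function names and the weights are hypothetical.

```python
import math

def optimal_alphabetic_cost(weights):
    """Minimum of sum_i q_i * w_i over order-preserving binary trees (problem (2)).

    Brute-force O(n^3) dynamic program, not the Hu-Tucker algorithm itself.
    """
    n = len(weights)
    prefix = [0.0]
    for w in weights:
        prefix.append(prefix[-1] + w)
    # best[i][j]: minimum weighted path length for terminal nodes i..j
    best = [[0.0] * n for _ in range(n)]
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            split = min(best[i][k] + best[k + 1][j] for k in range(i, j))
            # every node in i..j sits one level deeper under the new root
            best[i][j] = split + (prefix[j + 1] - prefix[i])
    return best[0][n - 1]

def entropy_bits(weights):
    """Entropy of the weights in base 2."""
    return sum(w * math.log2(1.0 / w) for w in weights if w > 0)

weights = [0.1, 0.2, 0.3, 0.4]
questions = optimal_alphabetic_cost(weights)   # expected number of questions
lower = entropy_bits(weights)
upper = lower + 2.0
```

Multiplying `questions`, `lower` and `upper` by the unit cost c gives the expected operating cost and its two bounds.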

4.   Solution  of  the  Problem 

It  is  crucial  to  prove  first  that  we  would  never  wish  to  use  a 
search  procedure  that  may  ask  an  infinite  number  of  questions  (have  an 
infinite  path  length)  with  positive  probability.   This  will  show  that 
we  lose  nothing  in  restricting  our  attention  to  finite  trees. 

Theorem 1: Given continuously differentiable functions $u(\cdot)$ and $F(\cdot)$, there exists an upper bound on the number of terminal nodes, and hence on path lengths, for an optimum search procedure, if one exists.

Proof: The idea is simple. Any infinite search procedure on a compact set must make arbitrarily fine distinctions with positive probability. We show that since the potential gains are bounded, it becomes counterproductive to keep searching.

Since $u'$ and $F'$ are continuous on $[0,1]$, there exist $A > 0$ and $B > 0$ such that $u'(x) < A$ and $F'(x) > -B$ for $x \in [0,1]$. Consider $[a,b] \subset [0,1]$ of length $\delta$. Note

$\max_{x \in [a,b]} (u(x) - u(a))\,P\{x < \bar{x} \mid \bar{x} \in [a,b]\} < A\delta B\delta$.

If $\delta \le \sqrt{\frac{c}{AB}}$ then any $(t,y) \in S$ with $y_i = a < y_{i+1} < b = y_{i+2}$ for some $i$ between 1 and $n(t)-1$ cannot be optimal, because if you delete $y_{i+1}$: 1) in the event that $\bar{x} \in [a,b]$ at least $c$ in search cost is saved and utility is lowered by less than $c$; 2) in the event $\bar{x} \notin [a,b]$ no utility is lost and $c$ in search cost is saved.

This proves that if an optimum search procedure exists, it must have fewer than $1/\sqrt{\frac{c}{AB}}$ terminal nodes. □
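To give a sense of magnitudes, the bound from the proof can be evaluated directly. This is a small sketch of the arithmetic only; the constants A = B = 2 are hypothetical (they would be valid, for instance, when $u' = 1$ and $F' = -1$ throughout), and `terminal_node_bound` is our own name.

```python
import math

def terminal_node_bound(c, A, B):
    """Bound 1 / sqrt(c / (A*B)) on terminal nodes, as stated in the proof of Theorem 1.

    A bounds u' from above and B bounds -F' from above on [0, 1].
    """
    if c <= 0 or A <= 0 or B <= 0:
        raise ValueError("c, A and B must be positive")
    # adjacent intervals spanning no more than delta are never optimal
    delta = math.sqrt(c / (A * B))
    return 1.0 / delta

bound = terminal_node_bound(c=0.01, A=2.0, B=2.0)
```

With these numbers the critical length is 0.05, so the bound allows at most 20 terminal nodes; it grows like the inverse square root of c as questions become cheaper.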

Any finite search procedure $(t,y) \in S$ determines which element of the partition $\{[y_1,y_2), [y_2,y_3), \ldots, [y_{n(t)}, y_{n(t)+1}]\}$ contains $\bar{x}$. Let $Y_n = \{(y_1, y_2, \ldots, y_{n+1}): y_1 = 0,\ y_{n+1} = 1,\ y_i \ge y_{i-1}$ for $i = 2, 3, \ldots, n+1\}$ for $n = 2, 3, \ldots$ . Let $Y = \bigcup_{n=2}^{\infty} Y_n$. Define $w: Y \to \bar{\Delta}$ (the closure of $\Delta$) by $w(y) = (F(y_1) - F(y_2),\ F(y_2) - F(y_3),\ \ldots,\ F(y_n) - F(y_{n+1}))$. Fix $c > 0$. If $(t,y)$ solves (1), then $t = t^*(w(y))$, i.e., we must be using the Hu-Tucker tree.

(3)  $C((t^*(w(y)),y),c) = c \sum_{i=1}^{n(t)} w_i(y) \ln(\frac{1}{w_i(y)}) + \delta(y,c)$

where $0 \le \delta \le 2c$, by the Gilbert and Moore result at the end of Section Three. Note that $\delta$ depends only on $P$, $y$, and $c$, because the Hu-Tucker algorithm depends only on the probabilities of the terminal nodes.

Keeping $c$ fixed, consider the problem,

(4)  $\max_{Y} \sum_{i=1}^{n} [u(y_i)\,w_i(y) - c\,w_i(y) \ln(\frac{1}{w_i(y)})] - \delta(y,c)$

If $(t,y)$ solves (1), then $y$ must solve (4), and its solutions are exactly the same as the solutions to

(5)  $\max_{n} \max_{Y_n} \sum_{i=1}^{n} \{[u(y_i) + c\ln(w_i(y))]\,w_i(y)\} - \delta(y,c)$.

Also, if $y$ solves (5), then $(t^*(w(y)), y)$ solves (1).

Remark 3: To make (4) and (5) well-defined problems, define $-\infty \cdot 0 = 0$. If $w_i(y) = 0$ for some $i$ then $c\ln(w_i(y)) \cdot w_i(y) = 0$. One can use L'Hôpital's rule to make sure that this assignment preserves continuity of

(*)  $\sum_{i=1}^{n} [u(y_i) + c\ln(w_i(y))]\,w_i(y)$

over $Y_n$ for each $n$.

Lemma 1: The maximand in (4) (denoted here as (*)) is a continuous function for any $c > 0$ and $y \in Y$.
The proof is very simple but omitted.
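The zero-weight convention is easy to mishandle in computation, so a minimal sketch of the maximand (*) may help. It assumes $u(x) = x$ and $F(x) = 1 - x$ purely for illustration, and the name `maximand` is ours.

```python
import math

def maximand(y, u, F, c):
    """sum_i [u(y_i) + c*ln(w_i(y))] * w_i(y), with the convention 0 * ln(0) = 0."""
    total = 0.0
    for i in range(len(y) - 1):
        w = F(y[i]) - F(y[i + 1])
        if w < 0:
            raise ValueError("partition points must be nondecreasing")
        if w > 0:
            total += (u(y[i]) + c * math.log(w)) * w
        # w == 0: the term is defined to be zero, so nothing is added
    return total

u = lambda x: x
F = lambda x: 1.0 - x
base = maximand([0.0, 0.5, 1.0], u, F, c=0.1)
padded = maximand([0.0, 0.5, 0.5, 1.0], u, F, c=0.1)   # repeated point, zero weight
```

Repeating a partition point adds a zero-weight interval and leaves the value unchanged, which is why such points can be deleted without loss.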

Note that if a solution to (5) exists (call it $y$), then we can assume $y_{i+1} > y_i$ for each $i$, because if $y_{i+1} = y_i$ we can just delete $y_{i+1}$ and the maximand will retain the same value.

Theorem  2:   A  solution  to  (5)  exists. 

Proof: For each $n$ a solution to

$\max_{Y_n} \sum_{i=1}^{n} [u(y_i) + c\ln(w_i(y))]\,w_i(y) - \delta(y,c)$

exists. It is simply a matter of maximizing a continuous function over a compact set. We only need to show that the value of this problem does not increase indefinitely as $n$ grows.

But Theorem 1 tells us that there is an upper bound $n^*$ on the number of terminal nodes that an optimal search procedure can possibly have. So we only have to solve

$\max_{n \le n^*} \max_{Y_n} \sum_{i=1}^{n} [u(y_i) + c\ln(w_i(y))]\,w_i(y) - \delta(y,c)$

which has a solution. □

Let $y: R_+ \to Y$ be a function that selects a solution to (5) for each positive number $c$. Then $(t^*(w(y(c))), y(c)) \in S$ is a solution to (1) for each $c$. In short, to solve (1), the planner should first solve (5), and then apply the Hu-Tucker algorithm to the partition of $[0,1]$ generated by the solution. The problem is that since we do not know the function $\delta(y,c)$, we cannot specify (5).

From Theorem 1, we know that for each $c > 0$ there exists $n^*(c)$ such that the optimal search procedure cannot have more than $n^*(c)$ terminal nodes. The problem

(6)  $\max_{n \le n^*(c)} \max_{Y_n} \sum_{i=1}^{n} [u(y_i) + c\ln w_i(y)]\,w_i(y)$,

must have a solution, because for each $n$ we only need to maximize a continuous function over a compact set.

Fix the function $n^*(c)$ and let $z: R_+ \to Y$ be a function that selects a solution to (6) for each $c > 0$. This $z$ is an approximate solution to (5). In fact, the payoff from operating search procedure $(t^*(w(z(c))), z(c))$ is within $2c$ of the payoff from operating procedure $(t^*(w(y(c))), y(c))$ by the result of Gilbert and Moore. So, if $c$ is small it seems acceptable to solve problem (6) rather than (5), and then apply the Hu-Tucker algorithm to construct a search procedure. We do not lose more than $2c$ this way.




Let $v_y(c)$ be the value of the optimal search procedure when the unit search cost is $c$ and let $v_z(c)$ be the value of the procedure $(t^*(w(z(c))), z(c))$. Since $2c$ might be large relative to the value of the problem, we define a relative error function

$\epsilon(c) = \frac{v_y(c) - v_z(c)}{v_y(c)}$.

$v_y(c)$ must be decreasing in $c$, so this expression must become small as $c$ converges to zero. Also, we know

$\epsilon(c) \le \frac{2c}{v_z(c)} = t(c)$.

If $t(c)$ is a large positive number then our approximation is bad. If $t(c)$ is a small positive number, then we know our approximation is good.

Our final result is that as $c$ becomes small, we can capture the entire potential value of the problem at a tiny cost.

Theorem 3: As $c \to 0$ the value of (1) tends to $-\int_0^1 u(x)F'(x)\,dx$.

Proof: Consider a sequence of unit costs such that

$c_n = \frac{1}{n\ln(n)}$.

The strategy of the proof is to pick a partition of the interval for each $c_n$ that gives equal probability weight to each subinterval. This will not be the optimal partition, but we will show that the payoffs that result from these partitions approach $-\int_0^1 u(x)F'(x)\,dx$ as $n$ tends to infinity.

Consider a sequence of partitions such that $y^n \in Y_n$ and $w_i(y^n) = \frac{1}{n}$ for each $i$ and for each $n$. Standard Riemann integration theory implies that

$\sum_{i=1}^{n} u(y_i^n)\,w_i(y^n) \to -\int_0^1 u(x)F'(x)\,dx$ as $n \to \infty$.

(Since $F' < 0$, the density of $\bar{x}$ is $-F'$, so this limit is the expected utility from knowing $\bar{x}$ exactly.)

We will show that as $n \to \infty$ the cost of operating $t^*(w(y^n))$ tends to zero. This will prove that the value of the search procedure $(t^*(w(y^n)), y^n)$ tends to $-\int_0^1 u(x)F'(x)\,dx$. This is sufficient to prove the theorem because $-\int_0^1 u(x)F'(x)\,dx$ is clearly an upper bound on the value of the problem.

If we plug in $\frac{1}{n}$ for $w_i(y^n)$ into the cost function, we get $\sum_{i=1}^{n} c\ln(n)\frac{1}{n} + \delta(y,c)$, which equals $c\ln(n) + \delta(y,c)$. Along the sequence we have defined, we know that the cost of $t^*(w(y^n))$ is always less than or equal to $\frac{1}{n} + \frac{2}{n\ln(n)}$, which tends to zero. Of course, if we are using optimal procedures for each $c_n$ the payoff tends to $-\int_0^1 u(x)F'(x)\,dx$ even faster.

□
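The convergence argument can be traced numerically. The sketch below assumes $u(x) = x$ and $F(x) = 1 - x$, for which the limiting value is 1/2, and uses the cost bound $1/n + 2/(n\ln n)$ exactly as it is stated in the proof. The helper names and the explicit inverse `F_inv` are hypothetical.

```python
import math

def equal_weight_partition(n, F_inv):
    """Points y_1..y_{n+1} with F(y_i) - F(y_{i+1}) = 1/n; F_inv inverts F on [0, 1]."""
    return [F_inv(1.0 - i / n) for i in range(n + 1)]

def value_lower_bound(n, u, F_inv):
    """Payoff of the equal-weight partition minus the cost bound 1/n + 2/(n ln n)."""
    y = equal_weight_partition(n, F_inv)
    # each of the first n points is chosen with probability 1/n
    payoff = sum(u(y[i]) for i in range(n)) / n
    cost_bound = 1.0 / n + 2.0 / (n * math.log(n))
    return payoff - cost_bound

u = lambda x: x
F_inv = lambda p: 1.0 - p          # inverse of F(x) = 1 - x
bounds = [value_lower_bound(n, u, F_inv) for n in (10, 100, 1000)]
```

The resulting lower bounds on the value rise toward 1/2 as n grows, even though the equal-weight partition is not claimed to be optimal.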


5.   Examples  and  Qualitative  Analysis 
Example 1. Let $u(x) = x$ and

$F(x) = 1 - x$ for $x \in [0,1]$, $F(x) = 1$ for $x < 0$, $F(x) = 0$ for $x > 1$.

Then (6) becomes,

(7)  $\max_{n \in N} \max_{Y_n} \sum_{i=1}^{n} (y_i + c\ln(y_{i+1} - y_i))(y_{i+1} - y_i)$.

There is a unique solution to (7), which must be of the form $y = (0, \frac{1}{n}, \frac{2}{n}, \ldots, 1)$ for some $n$. The problem then becomes

(8)  $\max_{n \in N} \sum_{i=1}^{n} (\frac{i-1}{n} - c\ln(n))\frac{1}{n}$, or

(9)  $\max_{n \in N} f(n) = \frac{n-1}{2n} - c\ln(n)$.

Treating $n$ as a real number we find the optimum is at $n = \frac{1}{2c}$, with $f'(n) = \frac{1}{2n^2} - \frac{c}{n}$ positive to the left and negative to the right of $\frac{1}{2c}$. Since $n^*$ must be an integer we must check the two closest integers to $\frac{1}{2c}$ to find $n^*$. The optimum over $n$ will shift from $n$ to $n+1$ when $c$ satisfies

(10)  $\frac{n-1}{2n} - c\ln(n) = \frac{n}{2n+2} - c\ln(n+1)$,

so if

$\frac{1}{2(n^2-n)\ln(\frac{n}{n-1})} > c > \frac{1}{2(n^2+n)\ln(\frac{n+1}{n})}$

then $n^* = n$. On the upper boundary of this interval $n$ is exactly as good as $n-1$, and on the lower boundary $n$ is the same as $n+1$.
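The integer optimum and the switching costs can be computed directly from (9) and (10). The following is a minimal sketch for this example only; the names `f`, `n_star` and `switch_cost` are ours, and ties are broken toward the smaller n by assumption.

```python
import math

def f(n, c):
    """Objective (9): (n - 1) / (2n) - c * ln(n)."""
    return (n - 1) / (2 * n) - c * math.log(n)

def n_star(c):
    """Best integer n: compare the two integers around the real optimum 1/(2c)."""
    x = 1.0 / (2.0 * c)
    candidates = {max(1, math.floor(x)), max(1, math.ceil(x))}
    # max() returns the first maximal element, so ties go to the smaller n
    return max(sorted(candidates), key=lambda n: f(n, c))

def switch_cost(n):
    """Unit cost solving (10), at which n and n + 1 terminal nodes are equally good."""
    return 1.0 / (2 * (n * n + n) * math.log((n + 1) / n))
```

For instance, `n_star(0.05)` returns 10, which is exactly 1/(2c), while unit costs just above and just below `switch_cost(2)` give 2 and 3 terminal nodes respectively.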

Define a function $v: S \to R$ so that $v(t,y)$ gives the expected value of using procedure $(t,y)$.

If $c = \frac{1}{2n}$ for some $n \in N$, then $n^* = \frac{1}{2c}$ and $v(t^*(w(z(c))), z(c)) = \frac{1}{2} - c + c\ln(2c)$. Off of these points, $v(t^*(w(z(c))), z(c)) < \frac{1}{2} - c + c\ln(2c)$, because the constraint that $n^*$ be an integer is binding. Also

$\frac{dv(t^*(w(z(c))), z(c))}{dc} = -\ln(n)$

for

$\frac{1}{2(n^2+n)\ln(\frac{n+1}{n})} < c < \frac{1}{2(n^2-n)\ln(\frac{n}{n-1})}$

and for $n \in N$, and it is undefined elsewhere.

Figure 3 is a graph of $v(t^*(w(z(c))), z(c))$ as a function of $c$. It is tangent to $g(c) = \frac{1}{2} - c + c\ln(2c)$ at the points $c = \frac{1}{2}, \frac{1}{4}, \ldots, \frac{1}{2n}, \ldots$ . Just to the left of a tangency point $v'(c) > g'(c)$, and just to the right $v'(c) < g'(c)$. In between tangency points $v$ has a kink point. $v'(c)$ is larger to the right of a kink point than to the left. These are the points where $n^*$ changes. So when $c$ decreases to $\frac{1}{2(n^2+n)\ln(\frac{n+1}{n})}$, $n^*$ increases from $n$ to $n+1$ and $v'(c)$ decreases from $-\ln(n)$ to $-\ln(n+1)$.
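The tangency and slope claims can be verified numerically for this example. The sketch below is an assumption-laden check rather than a reproduction of Figure 3: `v` takes the better of the two integers around 1/(2c), `g` is the envelope obtained by treating n as real, and `slope` is a finite-difference estimate that is only meaningful away from kink points.

```python
import math

def f(n, c):
    """Objective (9) for a fixed number n of terminal nodes."""
    return (n - 1) / (2 * n) - c * math.log(n)

def v(c):
    """Value of (9) at the best integer n, checking the integers around 1/(2c)."""
    x = 1.0 / (2.0 * c)
    return max(f(max(1, math.floor(x)), c), f(max(1, math.ceil(x)), c))

def g(c):
    """Envelope 1/2 - c + c*ln(2c), obtained by treating n as a real number."""
    return 0.5 - c + c * math.log(2.0 * c)

def slope(c, h=1e-6):
    """Central-difference estimate of v'(c); meaningful only away from kink points."""
    return (v(c + h) - v(c - h)) / (2.0 * h)
```

At c = 1/(2n) the two functions coincide, and at c = 0.2, where three terminal nodes are best, the estimated slope of v is -ln(3).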

To check the validity of the approximation, we need $\frac{2c}{\frac{n-1}{2n} - c\ln(n)} \le t$, where $n$ is $n^*(c)$ and $t$ is the tolerance level on $t(c)$. If $2c = \frac{1}{n}$, the condition is $n - \ln(n) \ge \frac{2}{t} + 1$.

Table 1 shows some rough calculations.

c        n      t
1/414    207    .01
1/92     46     .05
1/42     21     .10

Table 1
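The entries of Table 1 can be compared with the relative-error bound they imply. This is a sketch under the assumptions that $c = 1/(2n)$ in each row and that the logarithm is natural; the function name `implied_tolerance` is ours.

```python
import math

def implied_tolerance(n):
    """t(c) = 2c / v at c = 1/(2n), where v = (n - 1)/(2n) - c*ln(n)."""
    c = 1.0 / (2 * n)
    value = (n - 1) / (2 * n) - c * math.log(n)
    return 2 * c / value

rows = [(n, 1.0 / (2 * n), implied_tolerance(n)) for n in (207, 46, 21)]
for n, c, t in rows:
    print(f"n = {n:3d}   c = 1/{2 * n:<3d}   t(c) = {t:.3f}")
```

The implied values are of the same order as the t column without matching it exactly, which is consistent with the table being described as rough.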


Example 2: Sometimes search problems in $R^n$ are really just one-parameter search problems. Suppose there is a firm that is uncertain about the limits of its technology. The true feasible set $G$ belongs to $\{A_\alpha\}$, $\alpha \in [0,1]$, where $A_\alpha$ is a subset of $R^n$ such that if $\beta > \alpha$ then $A_\beta \supset A_\alpha$. Let $F: [0,1] \to [0,1]$ have a continuous and strictly negative first derivative, and let $F(\alpha)$ be the probability that $A_\alpha \subset G$.

Let $V: R^{n+1} \to R$ be a continuous von Neumann-Morgenstern utility function of the form $V(r_1, \ldots, r_n, r_{n+1}) = v(r_1, \ldots, r_n) - r_{n+1}$, where if $X, Y \in R^n$ and $X$ is bigger than $Y$ in every coordinate then $v(X) > v(Y)$. Then there exists an increasing function $t(\alpha) = v(\arg\max_{x \in A_\alpha} v(x))$. Suppose $t'$ exists, is continuous, and is greater than zero on $[0,1]$.

Suppose the firm can choose any $\alpha \in [0,1]$ and learn either $G \supset A_\alpha$ or $G \not\supset A_\alpha$, but these experiments have unit cost $c$. Then, letting $T: [0,1] \times R \to R$ be defined so that $T(r_1, r_2) = t(r_1) - r_2$, the problem is the one-parameter problem already treated.

Example 3: We know there is always an interior solution to (6). If $n^* \in N$ is optimal, then differentiating with respect to $y_i$, we find

(11)  $w_i(y) + \frac{cF'(y_i)}{u'(y_i)}\ln(w_i(y)) + \frac{F'(y_i)}{u'(y_i)}(u(y_i) - u(y_{i-1})) - \frac{cF'(y_i)}{u'(y_i)}\ln(w_{i-1}(y)) = 0$ for $i = 2, \ldots, n^*$.

Suppose $I \subset [0,1]$ is an interval where $F'$ is extremely large in absolute value compared to $u'/c$. Then we can drop the $w_i(y)$ term in (11) and still have a decent approximation to the first-order conditions. In this region (say for $i = n_1, \ldots, n_2$) we get

(12)  $w_i = w_{i-1}\exp(-\frac{u(y_i) - u(y_{i-1})}{c})$,  $i = n_1, \ldots, n_2$.

This means that $w_{i+1}(y) < w_i(y)$ for all $i$ in a region with extremely high concentration of probability.

If  F"  <  0  then  y..,  ~   y.    <  w.(y)  -  w.  ,(y),  i.e.,  as  we  move  to 
yi+l    l    l       l-l 

the right, the clustering is magnified. So, in a high-probability region searching is more intensive to the right than to the left, particularly when the probability concentration is increasing as we move from left to right.

6.   Conclusion 

It should be clear from the above presentation that even in one dimension the Bayesian approach to planning with costly iteration is rather complicated. It appears to be quite difficult to generalize this procedure to n dimensions, although it should be possible.

But this model does allow us to make some important points about planning. First, iteration is not free. It costs time, money, and effort. Second, planners have some prior notions about what should be possible for the economy and they make use of these ideas when they decide how they will gather information. It would be interesting to see some future work that builds on these notions.

Footnotes

1. See Cave and Hare (1981), Heal (1973) and Hurwicz (1973) for surveys.

2. Kornai (1973) has an excellent critique.

3. Recently Bennett (1985) has studied the problem of incomplete iteration in a totally different framework.

Appendix
The Hu-Tucker Algorithm

This presentation is adapted from that given in Knuth [5] with almost no alteration.

Phase 1, "combination." Start with a sequence of terminal nodes with weights $q_1, q_2, \ldots, q_n$. Repeatedly combine two weights $q_i$ and $q_j$ for $i < j$ into a single weight $q_i + q_j$. Write a new internal node with weight $q_i + q_j$ above node $q_i$ and with successors $q_i$ and $q_j$. The new node takes the place of $q_i$ in the working sequence and $q_j$ is removed. This combination is done to the unique pair of weights $(q_i, q_j)$ satisfying:

i) No terminal node occurs between $q_i$ and $q_j$.
ii) $q_i + q_j$ is minimal over all $(q_i, q_j)$ satisfying rule (i).
iii) The index $i$ is minimum over all $(q_i, q_j)$ satisfying rules (i), (ii).
iv) The index $j$ is minimum over all $(q_i, q_j)$ satisfying rules (i), (ii), and (iii).
When this procedure is finished, a binary tree has been constructed. There is a clear economic intuition behind it, which is that the path lengths to the most likely terminal nodes are short. By always combining the nodes with the smallest sum of weights we ensure that the longest paths lead to the most unlikely terminal nodes. The only problem with the tree constructed this way is that it need not preserve the ordering of terminal nodes. It allows us to partition the terminal nodes into any two sets we want. But we need to restrict our attention to procedures that only allow us to pick a terminal node and learn whether the true terminal node lies above or below our choice.

Figure 4 gives an example of a tree constructed using the phase 1 procedure. Note that the line connecting a node of weight 10 with the node of weight 15 intersects with the line connecting the terminal node of weight 9 with the node of weight 15. The contribution of Hu and Tucker is to show that this tree can be transformed into another tree that preserves the ordering on terminal nodes (i.e., lines connecting points do not intersect) and has the same expected cost of operation.

Phase 2, "level assignment." When phase 1 ends, there is a single node left in the working sequence. Mark it with level 0. Then undo the steps of phase 1 in reverse order, marking level numbers of the corresponding tree. If a given node has level $l$, then the two nodes that formed it have level $l+1$.

Phase 3, "recombination." Now we have a working sequence of terminal nodes $q_1, q_2, \ldots, q_n$ and levels

$l_1, l_2, \ldots, l_n$.

The internal nodes used in Phases 1 and 2 are now discarded and we create new ones by combining weights $(q_i, q_j)$ according to the following new rules:

(i) The nodes containing $q_i$ and $q_j$ must be adjacent in the working sequence.
(ii) The levels $l_i$ and $l_j$ must both be the maximum among all remaining levels.
(iii) The index $i$ must be minimum over all $(q_i, q_j)$ satisfying (i), (ii).

The new node is assigned level number $l_i - 1$. The binary tree formed during this phase has minimum weighted path length over all binary trees whose external nodes are weighted $q_1, \ldots, q_n$ from left to right.
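The three phases can be sketched in a few lines. What follows is a naive illustration of roughly cubic running time, not the efficient implementation discussed by Knuth, and it takes two shortcuts that are our own assumptions: levels are accumulated during phase 1 instead of by undoing it, and phase 3 uses a stack that merges adjacent nodes of equal level, which we expect to give the same tree as rules (i)-(iii) for level sequences produced by phases 1 and 2. All function names are hypothetical.

```python
def hu_tucker_levels(weights):
    """Phases 1 and 2: return the level (depth) assigned to each terminal node."""
    n = len(weights)
    levels = [0] * n
    # working sequence entries: [weight, is_terminal, indices of terminals below it]
    seq = [[w, True, [i]] for i, w in enumerate(weights)]
    while len(seq) > 1:
        best = None
        for i in range(len(seq) - 1):
            for j in range(i + 1, len(seq)):
                total = seq[i][0] + seq[j][0]
                # strict "<" keeps the smallest i, then the smallest j, on ties
                if best is None or total < best[0]:
                    best = (total, i, j)
                if seq[j][1]:
                    break  # a terminal node blocks every pair reaching past it
        total, i, j = best
        below = seq[i][2] + seq[j][2]
        for k in below:
            levels[k] += 1  # shortcut for phase 2: each merge deepens these nodes
        seq[i] = [total, False, below]
        del seq[j]
    return levels

def tree_from_levels(levels):
    """Phase 3 via a stack: merge adjacent nodes whenever their levels agree."""
    stack = []
    for index, level in enumerate(levels):
        stack.append((level, index))
        while len(stack) >= 2 and stack[-1][0] == stack[-2][0]:
            level_right, right = stack.pop()
            _, left = stack.pop()
            stack.append((level_right - 1, (left, right)))
    if len(stack) != 1 or stack[0][0] != 0:
        raise ValueError("levels do not describe a binary tree")
    return stack[0][1]

def weighted_path_length(weights, levels):
    """Sum of weight times level over the terminal nodes."""
    return sum(w * l for w, l in zip(weights, levels))
```

With the hypothetical weights (4, 1, 1, 4), phase 1 first joins the two middle nodes and then joins the result with the leftmost node. The levels are (2, 3, 3, 1) and the ordered tree is `((0, (1, 2)), 3)`, where the integers index the terminal nodes from left to right.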

Figures 4 and 5 show an example of the algorithm. In phase 1, nodes are formed in the order 4, 5, 10, 10, 13, 15, 21, 28, 49. To the left of each node is a number giving its level. The reader is referred to Hu and Tucker for a proof of the optimality of this construction.